Washoe County
The Download: the desert data center boom, and how to measure Earth's elevations
In the high desert east of Reno, Nevada, construction crews are flattening the golden foothills of the Virginia Range, laying the foundations of a data center city. Google, Tract, Switch, EdgeCore, Novva, Vantage, and PowerHouse are all operating, building, or expanding huge facilities nearby. Meanwhile, Microsoft has acquired more than 225 acres of undeveloped property, and Apple is expanding its existing data center just across the Truckee River from the industrial park. The corporate race to amass computing resources to train and run artificial intelligence models and store information in the cloud has sparked a data center boom in the desert--and it's just far enough away from Nevada's communities to elude wide notice and, some fear, adequate scrutiny. This story is part of Power Hungry: AI and our energy future--our new series shining a light on the energy demands and carbon costs of the artificial intelligence revolution.
Verification and Validation of a Vision-Based Landing System for Autonomous VTOL Air Taxis
Bansal, Ayoosh, Wang, Duo, Yeghiazaryan, Mikael, Li, Yangge, Tao, Chuyuan, Yoon, Hyung-Jin, Arora, Prateek, Papachristos, Christos, Voulgaris, Petros, Mitra, Sayan, Sha, Lui, Hovakimyan, Naira
Autonomous air taxis are poised to revolutionize urban mass transportation; however, ensuring their safety and reliability remains an open challenge. Validating autonomy solutions on air taxis in the real world presents complexities, risks, and costs that further complicate this challenge. Verification and Validation (V&V) frameworks play a crucial role in the design and development of highly reliable systems by formally verifying safety properties and validating algorithm behavior across diverse operational scenarios. Advancements in high-fidelity simulators have significantly enhanced their capability to emulate real-world conditions, encouraging their use for validating autonomous air taxi solutions, especially during early development stages. This evolution underscores the growing importance of simulation environments, not only as complementary tools to real-world testing but as essential platforms for evaluating algorithms in a controlled, reproducible, and scalable manner. This work presents a V&V framework for a vision-based landing system for air taxis with vertical take-off and landing (VTOL) capabilities. Specifically, we use Verse, a tool for formal verification, to model and verify the safety of the system by obtaining and analyzing the reachable sets. To conduct this analysis, we utilize a photorealistic simulation environment. The simulation environment, built on Unreal Engine, provides realistic terrain, weather, and sensor characteristics to emulate real-world conditions with high fidelity. To validate the safety analysis results, we conduct extensive scenario-based testing to assess the reachability set and robustness of the landing algorithm in various conditions. This approach showcases the representativeness of high-fidelity simulators, offering an effective means to analyze and refine algorithms before real-world deployment.
Data-Driven Graph Switching for Cyber-Resilient Control in Microgrids
Distributed microgrids are conventionally dependent on communication networks to achieve secondary control objectives. This dependence makes them vulnerable to stealth data integrity attacks (DIAs) where adversaries may perform manipulations via infected transmitters and repeaters to jeopardize stability. This paper presents a physics-guided, supervised Artificial Neural Network (ANN)-based framework that identifies communication-level cyberattacks in microgrids by analyzing whether incoming measurements will cause abnormal behavior of the secondary control layer. If abnormalities are detected, the framework iterates through the possible spanning-tree graph topologies that can fulfill the secondary control objectives. It then identifies and enforces a communication network topology that would not create secondary control abnormalities, for maximum stability. By altering the communication graph topology, the framework eliminates the dependence of the secondary control layer on inputs from compromised cyber devices, helping it achieve resilience without instability. Several case studies are provided showcasing the robustness of the framework against False Data Injections and repeater-level Man-in-the-Middle attacks. To understand practical feasibility, robustness is also verified against larger microgrid sizes and in the presence of varying noise levels. Our findings indicate that performance can be affected when attempting scalability in the presence of noise. However, the framework operates robustly in low-noise settings.
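The topology search described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the brute-force enumeration, the function names spanning_trees and first_safe_topology, and the is_abnormal predicate (a stand-in for the ANN-based detector) are all assumptions made for illustration.

```python
from itertools import combinations


def spanning_trees(nodes, edges):
    """Yield every spanning tree of an undirected graph as a tuple of edges.

    Brute force: any acyclic set of len(nodes) - 1 edges is a spanning tree.
    Only practical for small communication graphs.
    """
    nodes = list(nodes)
    for candidate in combinations(edges, len(nodes) - 1):
        parent = {n: n for n in nodes}  # fresh union-find per candidate

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        acyclic = True
        for u, v in candidate:
            root_u, root_v = find(u), find(v)
            if root_u == root_v:  # this edge would close a cycle
                acyclic = False
                break
            parent[root_u] = root_v
        if acyclic:
            yield candidate


def first_safe_topology(nodes, edges, is_abnormal):
    """Return the first spanning tree the detector does not flag, else None.

    is_abnormal is a hypothetical callable standing in for the trained
    detector; it takes a tuple of edges and returns True or False.
    """
    for tree in spanning_trees(nodes, edges):
        if not is_abnormal(tree):
            return tree
    return None


# Toy 4-node communication graph; assume the link (1, 2) is compromised.
NODES = [1, 2, 3, 4]
EDGES = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
safe_tree = first_safe_topology(NODES, EDGES, lambda tree: (1, 2) in tree)
```

In this toy case the search settles on a tree that avoids the compromised link. A real deployment would replace the lambda with the detector's verdict for each candidate topology.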
WTU-EVAL: A Whether-or-Not Tool Usage Evaluation Benchmark for Large Language Models
Ning, Kangyun, Su, Yisong, Lv, Xueqiang, Zhang, Yuanzhe, Liu, Jian, Liu, Kang, Xu, Jinan
Although Large Language Models (LLMs) excel in NLP tasks, they still need external tools to extend their ability. Current research on tool learning with LLMs often assumes mandatory tool use, which does not always align with real-world situations, where the necessity for tools is uncertain, and incorrect or unnecessary use of tools can damage the general abilities of LLMs. Therefore, we propose to explore whether LLMs can discern their ability boundaries and use tools flexibly. We then introduce the Whether-or-not tool usage Evaluation benchmark (WTU-Eval) to assess LLMs with eleven datasets, where six of them are tool-usage datasets, and five are general datasets. LLMs are prompted to use tools according to their needs. The results of eight LLMs on WTU-Eval reveal that LLMs frequently struggle to determine tool use in general datasets, and LLMs' performance in tool-usage datasets improves when their ability is similar to ChatGPT. In both datasets, incorrect tool usage significantly impairs LLMs' performance. To mitigate this, we also develop a fine-tuning dataset to enhance tool decision-making. Fine-tuning Llama2-7B results in a 14% average performance improvement and a 16.8% decrease in incorrect tool usage. We will release the WTU-Eval benchmark.
Through the Clutter: Exploring the Impact of Complex Environments on the Legibility of Robot Motion
Schmidt-Wolf, Melanie, Becker, Tyler, Oliva, Denielle, Nicolescu, Monica, Feil-Seifer, David
The environments in which the collaboration of a robot would be the most helpful to a person are frequently uncontrolled and cluttered with many objects present. Legible robot arm motion is crucial in tasks like these in order to avoid possible collisions, improve the workflow and help ensure the safety of the person. Prior work in this area, however, focuses on solutions that are tested only in uncluttered environments and there are not many results taken from cluttered environments. In this research we present a measure for clutteredness based on an entropic measure of the environment, and a novel motion planner based on potential fields. Both our measures and the planner were tested in a cluttered environment meant to represent a more typical tool sorting task for which the person would collaborate with a robot. The in-person validation study with Baxter robots shows a significant improvement in legibility of our proposed legible motion planner compared to the current state-of-the-art legible motion planner in cluttered environments. Further, the results show a significant difference in the performance of the planners in cluttered and uncluttered environments, and the need to further explore legible motion in cluttered environments. We argue that the inconsistency of our results in cluttered environments with those obtained from uncluttered environments points out several important issues with the current research performed in the area of legible motion planners.
WIP: A Unit Testing Framework for Self-Guided Personalized Online Robotics Learning
Shill, Ponkoj Chandra, Feil-Seifer, David, Ruiz, Jiullian-Lee Vargas, Wu, Rui
Our ongoing development and deployment of an online robotics education platform highlighted a gap in providing the interactive, feedback-rich learning environment essential for mastering programming concepts in robotics, which students were not getting with the traditional code-simulate-turn-in workflow. Since teaching resources are limited, students would benefit from real-time feedback to find and fix mistakes in their programming assignments. To address these concerns, this paper focuses on creating a system for unit testing and integrating it into the course workflow. We facilitate this real-time feedback by including unit testing in the design of programming assignments, so students can understand and fix their errors on their own, without instructors/TAs becoming a bottleneck. In line with the framework's personalized student-centered approach, this method makes it easier for students to revise and debug their programming work, encouraging hands-on learning. The course workflow updated to include unit tests will strengthen the learning environment and make it more interactive so that students can learn how to program robots in a self-guided fashion.
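As an illustration of the kind of feedback such unit tests can give, here is a minimal sketch using Python's standard unittest module. The assignment (differential-drive wheel speeds), the function name wheel_speeds, and the feedback messages are hypothetical; the abstract does not describe the platform's actual tests.

```python
import unittest


def wheel_speeds(linear, angular, wheel_base):
    """Hypothetical student submission: differential-drive inverse kinematics.

    Returns (left, right) wheel speeds for a commanded linear and angular
    velocity, with positive angular velocity meaning a left turn.
    """
    left = linear - angular * wheel_base / 2.0
    right = linear + angular * wheel_base / 2.0
    return left, right


class WheelSpeedsTest(unittest.TestCase):
    """Each failure message tells the student what to look at next."""

    def test_straight_line(self):
        left, right = wheel_speeds(0.5, 0.0, 0.3)
        self.assertAlmostEqual(
            left, right,
            msg="Driving straight: both wheels should turn at the same speed.")

    def test_turn_in_place(self):
        left, right = wheel_speeds(0.0, 1.0, 0.3)
        self.assertAlmostEqual(
            left, -right,
            msg="Turning in place: wheel speeds should be equal and opposite.")

    def test_left_turn(self):
        left, right = wheel_speeds(0.5, 1.0, 0.3)
        self.assertLess(
            left, right,
            msg="Left turn: the right wheel should be the faster one.")


def run_checks():
    """Run the assignment's tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WheelSpeedsTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_checks() after each edit gives the student a pass/fail summary and a targeted hint for each failing case without waiting for an instructor.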
Communication-Efficient Training Workload Balancing for Decentralized Multi-Agent Learning
Mohammadabadi, Seyed Mahmoud Sajjadi, Yang, Lei, Yan, Feng, Zhang, Junshan
Decentralized Multi-agent Learning (DML) enables collaborative model training while preserving data privacy. However, inherent heterogeneity in agents' resources (computation, communication, and task size) may lead to substantial variations in training time. This heterogeneity creates a bottleneck, lengthening the overall training time due to straggler effects and potentially wasting spare resources of faster agents. To minimize training time in heterogeneous environments, we present a Communication-Efficient Training Workload Balancing for Decentralized Multi-Agent Learning (ComDML), which balances the workload among agents through a decentralized approach. Leveraging local-loss split training, ComDML enables parallel updates, where slower agents offload part of their workload to faster agents. To minimize the overall training time, ComDML optimizes the workload balancing by jointly considering the communication and computation capacities of agents, which hinges upon integer programming. A dynamic decentralized pairing scheduler is developed to efficiently pair agents and determine optimal offloading amounts. We prove that in ComDML, both slower and faster agents' models converge, for convex and non-convex functions. Furthermore, extensive experimental results on popular datasets (CIFAR-10, CIFAR-100, and CINIC-10) and their non-I.I.D. variants, with large models such as ResNet-56 and ResNet-110, demonstrate that ComDML can significantly reduce the overall training time while maintaining model accuracy, compared to state-of-the-art methods. ComDML demonstrates robustness in heterogeneous environments, and privacy measures can be seamlessly integrated for enhanced data protection.
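The offloading decision between one slower and one faster agent can be sketched as a tiny integer search. This is an illustrative toy, not the ComDML formulation: the cost model (uniform per-layer compute time, a fixed one-off transfer cost, both agents running in parallel) and the name best_offload are assumptions, and exhaustive search stands in for the integer program.

```python
def best_offload(num_layers, slow_time_per_layer, fast_time_per_layer,
                 transfer_time):
    """Pick how many layers the slower agent should offload.

    Assumed cost model (hypothetical):
      - each agent trains a model with num_layers layers;
      - the slower agent keeps num_layers - k layers and offloads k;
      - the faster agent trains its own layers plus the k offloaded ones,
        and pays transfer_time once if anything is offloaded;
      - both agents run in parallel, so the round ends when the slower
        of the two finishes.
    Returns (k, round_time) minimizing the round time.
    """
    best_k, best_time = 0, float("inf")
    for k in range(num_layers + 1):
        slow_finish = (num_layers - k) * slow_time_per_layer
        fast_finish = (num_layers + k) * fast_time_per_layer
        if k > 0:
            fast_finish += transfer_time
        round_time = max(slow_finish, fast_finish)
        if round_time < best_time:
            best_k, best_time = k, round_time
    return best_k, best_time


# Toy numbers: the slow agent needs 4.0 time units per layer, the fast one 1.0.
k, round_time = best_offload(num_layers=10, slow_time_per_layer=4.0,
                             fast_time_per_layer=1.0, transfer_time=2.0)
```

With these toy numbers, offloading nothing costs 40 time units per round, while the search finds a split that brings the round down to 18, which shows why communication and computation have to be weighed together.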
Spatially temporally distributed informative path planning for multi-robot systems
Nguyen, Binh, Nguyen, Linh, Nghiem, Truong X., La, Hung, Baca, Jose, Rangel, Pablo, Montoya, Miguel Cid, Nguyen, Thang
This paper investigates the problem of informative path planning for a mobile robotic sensor network in spatially temporally distributed mapping. The robots are able to gather noisy measurements from an area of interest during their movements to build a Gaussian Process (GP) model of a spatio-temporal field. The model is then utilized to predict the spatio-temporal phenomenon at different points of interest. To spatially and temporally navigate the group of robots so that they can optimally acquire maximal information gains while their connectivity is preserved, we propose a novel multistep prediction informative path planning optimization strategy employing our newly defined local cost functions. By using the dual decomposition method, it is feasible and practical to effectively solve the optimization problem in a distributed manner. The proposed method was validated through synthetic experiments utilizing real-world data sets.
GPT-4's assessment of its performance in a USMLE-based case study
Dhakal, Uttam, Singh, Aniket Kumar, Devkota, Suman, Sapkota, Yogesh, Lamichhane, Bishal, Paudyal, Suprinsa, Dhakal, Chandra
This study investigates GPT-4's assessment of its performance in healthcare applications. A simple prompting technique was used to prompt the LLM with questions taken from the United States Medical Licensing Examination (USMLE) questionnaire, and it was tasked to evaluate its confidence score before and after each question was posed. The questionnaire was categorized into two groups: questions with feedback (WF) and questions with no feedback (NF) post-question. The model was asked to provide absolute and relative confidence scores before and after each question. The experimental findings were analyzed using statistical tools to study the variability of confidence in the WF and NF groups. Additionally, a sequential analysis was conducted to observe the performance variation for the WF and NF groups. Results indicate that feedback influences relative confidence but doesn't consistently increase or decrease it. Understanding the performance of LLMs is paramount in exploring their utility in sensitive areas like healthcare. This study contributes to the ongoing discourse on the reliability of AI, particularly of LLMs like GPT-4, within healthcare, offering insights into how feedback mechanisms might be optimized to enhance AI-assisted medical education and decision support.
WIP: Development of a Student-Centered Personalized Learning Framework to Advance Undergraduate Robotics Education
Shill, Ponkoj Chandra, Wu, Rui, Jamali, Hossein, Hutchins, Bryan, Dascalu, Sergiu, Harris, Frederick C., Feil-Seifer, David
This paper presents a work-in-progress on a learning system that will provide robotics students with a personalized learning environment. This addresses both the scarcity of skilled robotics instructors, particularly in community colleges, and the high cost of training equipment. The study of robotics at the college level represents a wide range of interests, experiences, and aims. This project works to provide students the flexibility to adapt their learning to their own goals and prior experience. We are developing a system to enable robotics instruction through a web-based interface that is compatible with less expensive hardware. Therefore, the free distribution of teaching materials will empower educators. This project has the potential to increase the number of robotics courses offered at both two- and four-year schools and universities. The course materials are being designed with small units and a hierarchical dependency tree in mind; students will be able to customize their course of study based on the robotics skills they have already mastered. We present an evaluation of a five-module mini-course in robotics. Students indicated that they had a positive experience with the online content. They also scored the experience highly on relatedness, mastery, and autonomy perspectives, demonstrating strong motivation potential for this approach.